NOKMeans: Non-Orthogonal K-means Hashing
Authors
Abstract
Finding nearest neighbor points in a large-scale, high-dimensional data set is of wide interest in computer vision. One popular and efficient approach is to encode each data point as a binary code in Hamming space using separating hyperplanes. A condition that is often implicitly assumed is that the separating hyperplanes are mutually orthogonal. To increase the representation capability of the hyperplanes when used for indexing, we relax the orthogonality assumption without forsaking the alternate view of using cluster centers to represent the indexing partitions. This is achieved by viewing the data points in a space determined by their distances to the hyperplanes. We show that the proposed method is superior to existing state-of-the-art techniques on several large computer vision datasets.
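The hyperplane encoding described in the abstract can be illustrated with a minimal sketch. This is not the NOKMeans algorithm itself: the hyperplanes here are drawn at random rather than learned, and all function names (`make_hyperplanes`, `signed_distances`, `encode`, `hamming`, `nearest_by_hamming`) are hypothetical names chosen for illustration. The sketch only shows how separating hyperplanes that need not be orthogonal turn a point into a binary code, and how the signed distances to those hyperplanes give an alternate view of the same point.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw random hyperplanes (w, b) through the origin.

    Assumption for illustration only: random Gaussian normals, which are
    generally not mutually orthogonal. A real method would learn them.
    """
    rng = random.Random(seed)
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], 0.0)
            for _ in range(num_bits)]


def signed_distances(x, hyperplanes):
    """Signed distance from point x to each hyperplane w.x + b = 0."""
    distances = []
    for w, b in hyperplanes:
        norm = sum(wi * wi for wi in w) ** 0.5
        dot = sum(wi * xi for wi, xi in zip(w, x))
        distances.append((dot + b) / norm)
    return distances


def encode(x, hyperplanes):
    """One bit per hyperplane: 1 if x lies on the non-negative side."""
    return [1 if d >= 0 else 0 for d in signed_distances(x, hyperplanes)]


def hamming(a, b):
    """Number of positions where two equal-length binary codes differ."""
    return sum(u != v for u, v in zip(a, b))


def nearest_by_hamming(query, database, hyperplanes):
    """Index of the database point whose code is closest to the query's."""
    q = encode(query, hyperplanes)
    codes = [encode(x, hyperplanes) for x in database]
    return min(range(len(database)), key=lambda i: hamming(q, codes[i]))


# Two hand-picked, non-orthogonal hyperplanes in 2-D (normals (1,0) and (1,1)).
planes = [([1.0, 0.0], 0.0), ([1.0, 1.0], 0.0)]
database = [[-1.0, 0.5], [1.0, -0.5]]
best = nearest_by_hamming([2.0, -1.0], database, planes)
```

In this toy setup the two normals have a non-zero inner product, so the partitions they induce are not axis-aligned cells of an orthogonal grid; the vector returned by `signed_distances` is the kind of "distance to hyperplanes" representation the abstract refers to, under the assumptions stated above.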
Similar resources
Bounds for Resilient Functions and Orthogonal Arrays Extended Abstract
Orthogonal arrays (OAs) are basic combinatorial structures, which appear under various disguises in cryptology and the theory of algorithms. Among their applications are universal hashing, authentication codes, resilient and correlation-immune functions, derandomization of algorithms, and perfect local randomizers. In this paper, we give new bounds on the size of orthogonal arrays using Delsar...
Deep Multimodal Hashing with Orthogonal Units
Hashing is an important method for performing efficient similarity search. With the explosive growth of multimodal data, how to learn hashing-based compact representations for multimodal data becomes highly non-trivial. Compared with shallow-structured models, deep models present superiority in capturing multimodal correlations due to their high nonlinearity. However, in order to make the learne...
Deep Multimodal Hashing with Orthogonal Regularization
Hashing is an important method for performing efficient similarity search. With the explosive growth of multimodal data, how to learn hashing-based compact representations for multimodal data becomes highly non-trivial. Compared with shallow-structured models, deep models present superiority in capturing multimodal correlations due to their high nonlinearity. However, in order to make the learne...
Optimal Linear Hashing Files for Orthogonal Range Retrieval
In this paper, we are concerned with the problem of designing optimal linear hashing files for orthogonal range retrieval. Through the study of performance expressions, we show that optimal basic linear hashing files and optimal recursive linear hashing files for orthogonal range retrieval can be produced, in certain cases, by a greedy method called the MMI (minimum marginal increase) method; a...
K-means Clustering with Feature Hashing
One of the major problems of K-means is that one must use dense vectors for its centroids, and therefore it is infeasible to store such huge vectors in memory when the feature space is high-dimensional. We address this issue by using feature hashing (Weinberger et al., 2009), a dimension-reduction technique, which can reduce the size of dense vectors while retaining sparsity of sparse vectors. ...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2014